5 research outputs found

    FedCLIP: Fast Generalization and Personalization for CLIP in Federated Learning

    Federated learning (FL) has emerged in recent years as a paradigm for privacy-preserving computation. Unfortunately, FL faces two critical challenges that hinder its practical performance: data distribution heterogeneity and the high resource costs brought by large foundation models. Specifically, non-IID data across clients make existing FL algorithms hard to converge, while high computational and communication costs increase the difficulty of real-world deployment. In this paper, we propose an effective yet simple method, named FedCLIP, to achieve fast generalization and personalization for CLIP in federated learning. Concretely, we design an attention-based adapter for the large model, CLIP, and all remaining operations depend only on the adapter. Lightweight adapters make the most of the pretrained model's information and keep models adaptive to client-specific tasks; at the same time, the small scale of the adapter operations mitigates the computational and communication burden caused by large models. Extensive experiments are conducted on three datasets with distribution shifts. Qualitative and quantitative results demonstrate that FedCLIP significantly outperforms other baselines (9% overall improvement on PACS) and effectively reduces computational and communication costs (283x faster than FedAVG). Our code will be available at: https://github.com/microsoft/PersonalizedFL. Comment: Accepted by IEEE Data Engineering Bulletin; code is at: https://github.com/microsoft/PersonalizedF
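    The adapter-only scheme described above can be sketched as follows. The abstract does not specify the adapter architecture or encoder details, so the attention-style gating, the layer sizes, and the stand-in "frozen encoder" below are all illustrative assumptions, not the paper's actual design:

```python
import numpy as np

def frozen_encoder(x):
    """Stand-in for a frozen pretrained encoder (FedCLIP uses CLIP).
    Any fixed feature map suffices for this sketch."""
    d_in, d_feat = x.shape[1], 8
    W = np.linspace(-1.0, 1.0, d_in * d_feat).reshape(d_in, d_feat)
    return np.tanh(x @ W)

class AttentionAdapter:
    """Tiny attention-style gating adapter. Only these parameters are
    trained and communicated; the backbone stays frozen. The exact gating
    form is an assumption -- the abstract only says 'attention-based'."""
    def __init__(self, dim, hidden=4, seed=0):
        r = np.random.default_rng(seed)
        self.W1 = r.normal(scale=0.1, size=(dim, hidden))
        self.W2 = r.normal(scale=0.1, size=(hidden, dim))

    def forward(self, f):
        # Feature-wise sigmoid attention gate over the frozen features.
        gate = 1.0 / (1.0 + np.exp(-(np.tanh(f @ self.W1) @ self.W2)))
        return f * gate

    def params(self):
        return [self.W1, self.W2]

def fedavg_adapters(adapters):
    """FedAvg over adapter parameters ONLY. Shipping just the adapter,
    instead of the full model, is what keeps communication cheap."""
    avg = [np.mean([a.params()[i] for a in adapters], axis=0)
           for i in range(2)]
    for a in adapters:
        a.W1, a.W2 = avg[0].copy(), avg[1].copy()
```

    A round would then consist of each client locally updating its adapter on frozen-encoder features, followed by `fedavg_adapters` on the server; the CLIP backbone itself is never transmitted.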

    Frustratingly Easy Model Generalization by Dummy Risk Minimization

    Empirical risk minimization (ERM) is a fundamental machine learning paradigm. However, its generalization ability is limited in various tasks. In this paper, we devise Dummy Risk Minimization (DuRM), a frustratingly easy and general technique to improve the generalization of ERM. DuRM is extremely simple to implement: just enlarge the dimension of the output logits and then optimize with standard gradient descent. Moreover, we validate the efficacy of DuRM through both theoretical and empirical analysis. Theoretically, we show that DuRM induces greater gradient variance, which facilitates generalization by helping the model settle into flatter local minima. Empirically, we evaluate DuRM across different datasets, modalities, and network architectures on diverse tasks, including conventional classification, semantic segmentation, out-of-distribution generalization, adversarial training, and long-tailed recognition. Results demonstrate that DuRM consistently improves performance on all tasks in an almost-free-lunch manner. Furthermore, we show that DuRM is compatible with existing generalization techniques, and we discuss possible limitations. We hope that DuRM will trigger new interest in fundamental research on risk minimization. Comment: Technical report; 22 pages
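    Taken literally, the recipe above — enlarge the logit dimension, then run ordinary gradient descent — can be sketched with a linear softmax classifier. The dummy-class count, learning rate, and NumPy setup below are illustrative assumptions, not the paper's configuration; only the core idea (extra logit dimensions that no label ever points to) comes from the abstract:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_linear_durm(X, y, n_classes, n_dummy=0, lr=0.5, steps=200, seed=0):
    """Train a linear softmax classifier whose logit layer is enlarged by
    `n_dummy` extra dimensions (the DuRM trick, as the abstract describes it).
    Labels only ever index the first `n_classes` logits, so the dummy
    dimensions receive gradient but never a positive target."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.01, size=(X.shape[1], n_classes + n_dummy))
    for _ in range(steps):
        P = softmax(X @ W)                  # (N, n_classes + n_dummy)
        G = P.copy()
        G[np.arange(len(y)), y] -= 1.0      # cross-entropy gradient wrt logits
        W -= lr * (X.T @ G) / len(y)        # standard gradient descent
    return W

def predict(X, W, n_classes):
    # Dummy logits are simply ignored at inference time.
    return np.argmax((X @ W)[:, :n_classes], axis=1)
```

    Training is otherwise unchanged, which is what makes the method an "almost free lunch": the only modification is the width of the output layer.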

    On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective

    ChatGPT is a chatbot service recently released by OpenAI that has received increasing attention over the past few months. While various aspects of ChatGPT have been evaluated, its robustness, i.e., its performance on unexpected inputs, remains unclear to the public. Robustness is of particular concern in responsible AI, especially for safety-critical applications. In this paper, we conduct a thorough evaluation of the robustness of ChatGPT from the adversarial and out-of-distribution (OOD) perspectives. To do so, we employ the AdvGLUE and ANLI benchmarks to assess adversarial robustness, and the Flipkart review and DDXPlus medical diagnosis datasets for OOD evaluation. We select several popular foundation models as baselines. Results show that ChatGPT has consistent advantages on most adversarial and OOD classification and translation tasks. However, its absolute performance is far from perfect, which suggests that adversarial and OOD robustness remains a significant challenge for foundation models. Moreover, ChatGPT shows astounding performance in understanding dialogue-related texts, and we find that it tends to provide informal suggestions for medical tasks instead of definitive answers. Finally, we present in-depth discussions of possible research directions. Comment: Technical report; code is at: https://github.com/microsoft/robustlear

    Chinese expert consensus on the diagnosis and treatment of malignant pleural mesothelioma

    No full text
    Malignant pleural mesothelioma (MPM) is a malignant tumor originating from the pleura, and its incidence has been increasing in recent years. Due to the insidious onset and strong local invasiveness of MPM, most patients are diagnosed at a late stage, so early screening and treatment for high‐risk populations are crucial. The treatment of MPM mainly includes surgery, chemotherapy, and radiotherapy. Immunotherapy and electric field therapy have also been applied, leading to further improvements in patient survival. The Mesothelioma Group of the Yangtze River Delta Lung Cancer Cooperation Group (East China LUng caNcer Group, ECLUNG; Youth Committee) developed a national consensus on the clinical diagnosis and treatment of MPM based on existing clinical research evidence and the opinions of national experts. This consensus aims to promote the homogenization and standardization of MPM diagnosis and treatment in China, covering epidemiology, diagnosis, treatment, and follow‐up.